SED: Scalable & Efficient De-duplication File System for Virtual Machine Images

Not recorded
Abstract

Virtualization is widely deployed in servers to provide many logically separate execution environments efficiently. By reducing the demand for physical servers, it conserves physical CPU resources. Nevertheless, it still consumes large amounts of storage because each virtual machine (VM) instance needs its own multi-gigabyte disk image. Existing systems attempt to reduce VM image storage consumption by means of de-duplication within a storage area network (SAN) cluster. However, a SAN cannot satisfy the increasing demand of large-scale VM hosting for cloud computing because of its cost. This paper proposes SED (Scalable & Efficient De-duplication), a file system designed primarily for large-scale VM deployment. Its design provides fast VM deployment through peer-to-peer (P2P) data transfer and low storage consumption through de-duplication of VM images. It also provides a comprehensive set of storage features, including on-demand fetching over a network, instant cloning of VM images, and caching on local disks using copy-on-read techniques. Experiments show that the proposed system's features perform well and introduce only minor performance overhead. They also show that simply identifying zero-filled blocks, even in ready-to-use VM disk images available on the Internet, can provide considerable storage savings.

Index Terms: cloud computing, de-duplication, file system, peer-to-peer data transfer, file storage, virtual machine
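To make the de-duplication and zero-filled-block ideas in the abstract concrete, here is a minimal Python sketch. It is not the paper's implementation: the 4 KiB block size, the SHA-256 fingerprint, the in-memory dictionary store, and the names `dedup_image` and `restore_image` are all assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4096               # assumed block size; the abstract does not give one
ZERO_BLOCK = bytes(BLOCK_SIZE)  # reference block of all zero bytes


def dedup_image(image: bytes, store: dict) -> list:
    """Split an image into fixed-size blocks and keep each unique block once.

    Returns the image's block map: one entry per block, holding either the
    block's fingerprint or None for a zero-filled block (which is not stored).
    """
    block_map = []
    for offset in range(0, len(image), BLOCK_SIZE):
        block = image[offset:offset + BLOCK_SIZE]
        # A zero-filled block needs no storage at all, only a marker.
        if block == ZERO_BLOCK[:len(block)]:
            block_map.append(None)
            continue
        # Hypothetical fingerprint choice: SHA-256 of the block contents.
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # store only the first copy
        block_map.append(fingerprint)
    return block_map


def restore_image(block_map: list, store: dict, length: int) -> bytes:
    """Rebuild an image of the given length from its block map."""
    out = bytearray()
    for fingerprint in block_map:
        out += ZERO_BLOCK if fingerprint is None else store[fingerprint]
    return bytes(out[:length])


# Two toy "images" that share one block and contain zero-filled regions.
image_a = b"A" * BLOCK_SIZE + bytes(BLOCK_SIZE) + b"B" * BLOCK_SIZE
image_b = b"A" * BLOCK_SIZE + bytes(2 * BLOCK_SIZE) + b"C" * 100

store = {}
map_a = dedup_image(image_a, store)
map_b = dedup_image(image_b, store)
```

In this toy run the two images contain seven blocks in total, but only three distinct non-zero blocks end up in `store`: the shared `A` block is kept once and the zero-filled blocks are recorded as markers only. A real system would keep the store on disk and would also have to handle fingerprint indexing and reference counting, which this sketch leaves out.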


Similar resources

Experiences with Content Addressable Storage and Virtual Disks

Efficiently managing storage is important for virtualized computing environments. Its importance is magnified by developments such as cloud computing which consolidate many thousands of virtual machines (and their associated storage). The nature of this storage is such that there is a large amount of duplication between otherwise discrete virtual machines. Building upon previous work in content...


Harnessing Metadata Characteristics for Efficient Deduplication in Distributed Storage Systems

As storage capacity requirements grow, storage systems are becoming distributed, and that distribution poses a challenge for space savings processes. In this thesis, I design and implement a mechanism for storing only a single instance of duplicated data within a distributed storage system which selectively performs deduplication across each of the independent computers, known as nodes, used fo...


Prebaked µVMs: Scalable, Instant VM Startup for IaaS Clouds

IaaS clouds promise instantaneously available resources to elastic applications. In practice, however, virtual machine (VM) startup times are in the order of several minutes, or at best, several tens of seconds, negatively impacting the elasticity of applications like Web servers that need to scale out to handle dynamically increasing load. VM startup time is strongly influenced by booting the ...


Live Deduplication Storage of Virtual Machine Images in an Open-Source Cloud

Deduplication is an approach of avoiding storing data blocks with identical content, and has been shown to effectively reduce the disk space for storing multi-gigabyte virtual machine (VM) images. However, it remains challenging to deploy deduplication in a real system, such as a cloud platform, where VM images are regularly inserted and retrieved. We propose LiveDFS, a live deduplication file ...


Reclaiming Space from Duplicate Files in a Serverless Distributed File System

The Farsite distributed file system provides availability by replicating each file onto multiple desktop computers. Since this replication consumes significant storage space, it is important to reclaim used space where possible. Measurement of over 500 desktop file systems shows that nearly half of all consumed space is occupied by duplicate files. We present a mechanism to reclaim space from t...



Journal title:

Volume   Issue 

Pages  -

Publication date 2015